Probability and Statistics

Bayesian Statistics: From Concept to Data Analysis

Rating: 4.5 out of 5

This course introduces the Bayesian approach to statistics, starting with the concept of probability and moving to the analysis of data. We will learn about the philosophy of the Bayesian approach as well as how to implement it for common types of data. We will compare the Bayesian approach to the more commonly taught Frequentist approach, and see some of the benefits of the Bayesian approach. In particular, the Bayesian approach allows for better accounting of uncertainty, results that have more intuitive and interpretable meaning, and more explicit statements of assumptions. This course combines lecture videos, computer demonstrations, readings, exercises, and discussion boards to create an active learning experience. For computing, you have the choice of using Microsoft Excel or the open-source, freely available statistical package R, with equivalent content for both options. The lectures provide some of the basic mathematical development as well as explanations of philosophy and interpretation. Completion of this course will give you an understanding of the concepts of the Bayesian approach, an understanding of the key differences between Bayesian and Frequentist approaches, and the ability to do basic data analyses.

Fitting Statistical Models to Data with Python

Rating: 4 out of 5

In this course, we will expand our exploration of statistical inference techniques by focusing on the science and art of fitting statistical models to data. We will build on the concepts presented in the Statistical Inference course (Course 2) to emphasize the importance of connecting research questions to our data analysis methods. We will also focus on various modeling objectives, including making inferences about relationships between variables and generating predictions for future observations. This course will introduce and explore various statistical modeling techniques, including linear regression, logistic regression, generalized linear models, hierarchical and mixed effects (or multilevel) models, and Bayesian inference techniques. All techniques will be illustrated using a variety of real data sets, and the course will emphasize different modeling approaches for different types of data sets, depending on the study design underlying the data (referring back to Course 1, Understanding and Visualizing Data with Python). During lab-based sessions, learners will work through tutorials focusing on specific case studies to help solidify the week’s statistical concepts, which will include further deep dives into Python libraries including Statsmodels, Pandas, and Seaborn. This course utilizes the Jupyter Notebook environment within Coursera.

Random Models, Nested and Split-plot Designs

Rating: 4.5 out of 5

Many experiments involve factors whose levels are chosen at random. A well-known situation is the study of measurement systems to determine their capability. This course presents the design and analysis of these types of experiments, including modern methods for estimating the components of variability in these systems. The course also covers experiments with nested factors, and experiments with hard-to-change factors that require split-plot designs. We also provide an overview of designs for experiments with nonnormal response distributions and experiments with covariates.

Data – What It Is, What We Can Do With It

Rating: 4.5 out of 5

This course introduces students to data and statistics. By the end of the course, students should be able to interpret descriptive statistics, causal analyses and visualizations to draw meaningful insights. The course first introduces a framework for thinking about the various purposes of statistical analysis. We’ll talk about how analysts use data for descriptive, causal and predictive inference. We’ll then cover how to develop a research study for causal analysis, compute and interpret descriptive statistics and design effective visualizations. The course will help you to become a thoughtful and critical consumer of analytics. If you are in a field that increasingly relies on data-driven decision making, but you feel unequipped to interpret and evaluate data, this course will help you develop these fundamental tools of data literacy.

A Crash Course in Causality: Inferring Causal Effects from Observational Data

Rating: 4.5 out of 5

We have all heard the phrase “correlation does not equal causation.” What, then, does equal causation? This course aims to answer that question and more! Over a period of 5 weeks, you will learn how causal effects are defined, what assumptions about your data and models are necessary, and how to implement and interpret some popular statistical methods. Learners will have the opportunity to apply these methods to example data in R (a free statistical software environment). At the end of the course, learners should be able to:
1. Define causal effects using potential outcomes
2. Describe the difference between association and causation
3. Express assumptions with causal graphs
4. Implement several types of causal inference methods (e.g. matching, instrumental variables, inverse probability of treatment weighting)
5. Identify which causal assumptions are necessary for each type of statistical method
So join us and discover for yourself why modern statistical methods for estimating causal effects are indispensable in so many fields of study!

Advanced Linear Models for Data Science 1: Least Squares

Rating: 4 out of 5

Welcome to the Advanced Linear Models for Data Science Class 1: Least Squares. This class is an introduction to least squares from a linear algebraic and mathematical perspective. Before beginning the class make sure that you have the following:
- A basic understanding of linear algebra and multivariate calculus.
- A basic understanding of statistics and regression models.
- At least a little familiarity with proof-based mathematics.
- Basic knowledge of the R programming language.
After taking this course, students will have a firm foundation in a linear algebraic treatment of regression modeling. This will greatly augment applied data scientists' general understanding of regression models.

Power and Sample Size for Multilevel and Longitudinal Study Designs

Rating: 4 out of 5

Power and Sample Size for Longitudinal and Multilevel Study Designs, a five-week, fully online course, covers innovative, research-based power and sample size methods and software for multilevel and longitudinal studies. The power and sample size methods and software taught in this course can be used for any health-related or, more generally, social science-related (e.g., educational research) application. All examples in the course videos are from real-world studies in behavioral and social science employing multilevel and longitudinal designs. The course philosophy is to focus on the conceptual knowledge needed to apply power and sample size methods. The goal of the course is to teach and disseminate methods for accurate sample size choice, and ultimately, the creation of a power/sample size analysis for a relevant research study in your professional context. Power and sample size selection is one of the most important ethical questions researchers face. Interventional studies that are too large expose human volunteer research participants to possible, and needless, harm from research. Interventional studies that are too small will fail to reach their scientific objective, again bringing possible harm to research participants, without the possibility of concomitant gain from the increase in knowledge. For studies in which there are no possible harms to the participants, such as observational studies, proper power ensures good stewardship of both time and money. Most National Institutes of Health (NIH) study sections will only fund a grant if the grantee has written a compelling and accurate power and sample size analysis. The Institute of Education Sciences (IES), the statistics, research, and evaluation arm of the U.S. Department of Education, also offers competitive grants requiring a compelling and accurate power and sample size analysis (Goal 3: Efficacy and Replication and Goal 4: Effectiveness/Scale-Up).
At the end of the online course, learners will be able to:
• Use a framework and strategy for study planning
• Write study aims as testable hypotheses
• Describe a longitudinal and multilevel study design
• Write a statistical analysis plan
• Plan a sampling design for subgroups, e.g. racial and ethnic
• Demonstrate the feasibility of recruitment
• Describe expected missing data and dropout
• Write a power and sample size analysis that is aligned with the planned statistical analysis
This is a five-week intensive and interactive online course. We will use a mix of instructional videos, software demonstration videos, online discussion forums, online readings, quizzes, exercise assignments, and peer-review assignments. The final course project is a peer-reviewed research study you design for future power or sample size analysis.

Statistics with R Capstone

Rating: 4.5 out of 5

The capstone project will be an analysis using R that answers a specific scientific/business question provided by the course team. A large and complex dataset will be provided to learners, and the analysis will require the application of a variety of methods and techniques introduced in the previous courses, including exploratory data analysis through data visualization and numerical summaries, statistical inference, and modeling, as well as interpretations of these results in the context of the data and the research question. The analysis will implement both frequentist and Bayesian techniques and discuss, in the context of the data, how these two approaches are similar and different, and what these differences mean for conclusions that can be drawn from the data. A sampling of the final projects will be featured on the Duke Statistical Science department website. Note: Only learners who have passed the four previous courses in the specialization are eligible to take the Capstone.

Basic Statistics

Rating: 4.5 out of 5

Understanding statistics is essential to understanding research in the social and behavioral sciences. In this course you will learn the basics of statistics: not just how to calculate them, but also how to evaluate them. This course will also prepare you for the next course in the specialization, Inferential Statistics. In the first part of the course we will discuss methods of descriptive statistics. You will learn what cases and variables are and how you can compute measures of central tendency (mean, median and mode) and dispersion (standard deviation and variance). Next, we discuss how to assess relationships between variables, and we introduce the concepts of correlation and regression. The second part of the course is concerned with the basics of probability: calculating probabilities, probability distributions and sampling distributions. You need to know about these things in order to understand how inferential statistics work. The third part of the course consists of an introduction to methods of inferential statistics, which help us decide whether the patterns we see in our data are strong enough to draw conclusions about the underlying population we are interested in. We will discuss confidence intervals and significance tests. You will not only learn about all these statistical concepts, you will also be trained to calculate and generate these statistics yourself using freely available statistical software.

Factorial and Fractional Factorial Designs

Rating: 4.5 out of 5

Many experiments in engineering, science and business involve several factors. This course is an introduction to these types of multifactor experiments. The appropriate experimental strategy for these situations is based on the factorial design, a type of experiment where factors are varied together. This course focuses on designing these types of experiments and on using ANOVA to analyze the resulting data. These types of experiments often include nuisance factors, and the blocking principle can be used in factorial designs to handle these situations. As the number of factors of interest grows, full factorials become too expensive and fractional versions of the factorial design are useful. This course will cover the benefits of fractional factorials, along with methods for constructing these experiments and analyzing the resulting data.
